Copenhagen University Research Information System

Sapientia

Generalized Spring Tensor Model: A New Improved Load Balancing Method in Cloud Computing

Author: A Blondel
G Gregorcic
H Kaya
HK Nakamura
L Xing
Q Zhang
T Hamelryck
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

Significant characteristics of cloud computing such as elasticity, scalability and its payment model attract businesses to replace their legacy infrastructure with newly offered cloud technologies. As the number of cloud users grows rapidly, the large load volume will affect the performance and operation of the cloud. Therefore, it is essential to develop smarter load management methods to ensure effective task scheduling and efficient management of resources. To reach these goals, a variety of algorithms have been explored and tested by many researchers. So far, however, few operational load balancing algorithms have been proposed that are capable of forecasting future load patterns in cloud-based systems. The aim of this research is to design an effective load management tool, characterized by the collective behavior of the workflow tasks and jobs, that is able to predict various dynamic load patterns occurring in cloud networks. The results show that the proposed load balancing algorithm can visualize the network load by projecting the existing relationships among submitted tasks and jobs. The visualization can be particularly useful for monitoring the robustness and stability of cloud systems. © Springer International Publishing Switzerland 2015

OPUS - University of Technology Sydney

A generative angular model of protein structure evolution

Author: García-Portugués E
Golden M
Hamelryck T
Jotun H
Mardia KV
Sorensen M
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Recently described stochastic models of protein evolution have demonstrated that the inclusion of structural information in addition to amino acid sequences leads to a more reliable estimation of evolutionary parameters. We present a generative, evolutionary model of protein structure and sequence that is valid on a local length scale. The model concerns the local dependencies between sequence and structure evolution in a pair of homologous proteins. The evolutionary trajectory between the two structures in the protein pair is treated as a random walk in dihedral angle space, which is modeled using a novel angular diffusion process on the two-dimensional torus. Coupling sequence and structure evolution in our model allows for modeling both “smooth” conformational changes and “catastrophic” conformational jumps, conditioned on the amino acid changes. The model has interpretable parameters and is comparatively more realistic than previous stochastic models, providing new insights into the relationship between sequence and structure evolution. For example, using the trained model we were able to identify an apparent sequence–structure evolutionary motif present in a large number of homologous protein pairs. The generative nature of our model enables us to evaluate its validity and its ability to simulate aspects of protein evolution conditioned on an amino acid sequence, a related amino acid sequence, a related structure or any combination thereof.

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Copenhagen University Research Information System

Oxford University Research Archive

Universidad Carlos III de Madrid e-Archivo

White Rose Research Online

Repositorio Institucional da Universidade de Santiago de Compostela

Potentials of Mean Force for Protein Structure Prediction Vindicated, Formalized and Generalized

Understanding protein structure is of crucial importance in science, medicine and biotechnology. For about two decades, knowledge-based potentials based on pairwise distances -- so-called "potentials of mean force" (PMFs) -- have been center stage in the prediction and design of protein structure and the simulation of protein folding. However, the validity, scope and limitations of these potentials are still vigorously debated and disputed, and the optimal choice of the reference state -- a necessary component of these potentials -- is an unsolved problem. PMFs are loosely justified by analogy to the reversible work theorem in statistical physics, or by a statistical argument based on a likelihood function. Both justifications are insightful but leave many questions unanswered. Here, we show for the first time that PMFs can be seen as approximations to quantities that do have a rigorous probabilistic justification: they naturally arise when probability distributions over different features of proteins need to be combined. We call these quantities reference ratio distributions, deriving from the application of the reference ratio method. This new view is not only of theoretical relevance, but leads to many insights that are of direct practical use: the reference state is uniquely defined and does not require external physical insights; the approach can be generalized beyond pairwise distances to arbitrary features of protein structure; and it becomes clear for which purposes the use of these quantities is justified. We illustrate these insights with two applications, involving the radius of gyration and hydrogen bonding. In the latter case, we also show how the reference ratio method can be iteratively applied to sculpt an energy funnel. Our results considerably increase the understanding and scope of energy functions derived from known biomolecular structures.

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Copenhagen University Research Information System

Calibur: a tool for clustering large numbers of protein decoys

Author: C Bystroff
D Shortle
H Li
KS Arun
KT Simons
MR Betancourt
S Boris
S Wu
SC Li
Shuai Cheng Li
T Hamelryck
Y Zhang
Yen Kaow Ng
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Ab initio protein structure prediction methods generate numerous structural candidates, which are referred to as decoys. The decoy with the largest number of neighbors within a threshold distance is typically identified as the most representative decoy. However, the clustering of decoys needed for this criterion involves computations with runtimes that are at best quadratic in the number of decoys. As a result, there is currently no tool designed to exactly cluster very large numbers of decoys, which creates a bottleneck in the analysis. Results: Using three strategies aimed at enhancing performance (proximate decoys organization, preliminary screening via lower and upper bounds, outliers filtering), we designed and implemented a software tool for clustering decoys called Calibur. We show empirical results indicating the effectiveness of each of the strategies employed. The strategies are further fine-tuned according to their effectiveness. Calibur demonstrated the ability to scale well with respect to increases in the number of decoys. For a sample size of approximately 30 thousand decoys, Calibur completed the analysis in one third of the time required when the strategies are not used. For practical use, Calibur is able to automatically discover from the input decoys a suitable threshold distance for clustering. Several methods for this discovery are implemented in Calibur; by default, a very fast one is used. Using the default method, Calibur reported relatively good decoys in our tests. Conclusions: Calibur's ability to handle very large protein decoy sets makes it a useful tool for clustering decoys in ab initio protein structure prediction. As the number of decoys generated by these methods increases, we believe Calibur will become important for progress in the field.
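The selection criterion described in this abstract (pick the decoy with the most neighbors within a threshold distance) can be illustrated with a minimal Python sketch. This is the naive quadratic baseline, not Calibur's implementation: it uses none of the three acceleration strategies, and it assumes toy decoys given as coordinate tuples with Euclidean distance standing in for a structural distance such as RMSD. The function name `most_representative` is hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def most_representative(decoys, threshold):
    """Return (index, neighbor_count) of the decoy with the most neighbors.

    Assumption: each decoy is a tuple of coordinates and Euclidean distance
    stands in for a real structural distance measure.
    """
    if not decoys:
        raise ValueError("decoys must not be empty")

    counts = [0] * len(decoys)
    # Naive all-pairs comparison: quadratic in the number of decoys.
    for i in range(len(decoys)):
        for j in range(i + 1, len(decoys)):
            if dist(decoys[i], decoys[j]) <= threshold:
                counts[i] += 1
                counts[j] += 1

    # Ties are resolved in favor of the lowest index.
    best = max(range(len(decoys)), key=counts.__getitem__)
    return best, counts[best]


toy_decoys = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
best_index, neighbors = most_representative(toy_decoys, threshold=1.2)
```

The nested loop is the cost the abstract refers to: every pair of decoys is compared once, which is what makes exact clustering of very large decoy sets expensive without additional screening.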

p3d – Python module for structural bioinformatics

Author: AK Ghose
AV Morozov
BA Grzybowski
C Fufezan
C Negron
Christian Fufezan
DF Savage
F Henry
G Wang
HM Berman
IK McDonald
JJ Perona
Michael Specht
MR Starich
O Gursky
R Grunberg
S Hovmoller
T Hamelryck
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: High-throughput bioinformatic analysis tools are needed to mine the large amount of structural data via knowledge-based approaches. The development of such tools requires a robust interface that provides easy access to the structural data. For this, the Python scripting language is the optimal choice, since its philosophy is to write understandable source code. Results: p3d is an object-oriented Python module that adds a simple yet powerful interface to the Python interpreter to process and analyse three-dimensional protein structure files (PDB files). p3d's strength arises from the combination of a) very fast spatial access to the structural data due to the implementation of a binary space partitioning (BSP) tree, b) set theory and c) functions that allow a) and b) to be combined and that use human-readable language in the search queries rather than complex computer language. All these factors combined facilitate the rapid development of bioinformatic tools that can perform quick and complex analyses of protein structures. Conclusion: p3d is the perfect tool to quickly develop tools for structural bioinformatics using the Python scripting language.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Prediction of binding hot spot residues by using structural and evolutionary parameters

Author: Ahmed S
Altschul SF
Apweiler R
Arkin MR
Ban YA
Bogan AA
Bradford JR
Chang CC
Clésio Luis Tozzi
Cristianini N
Darnell SJ
DeLano WL
Duda RO
Eisenberg D
el-Deiry WS
Fawcett T
Fernández-Recio J
Frishman D
Guney E
Hagerty CG
Hamelryck T
Hanley JA
Hastie T
Higa RH
Higgins D
Hu Z
Jones S
Kato S
Kidera A
Kirsch T
Koenderink JJ
Kortemme T
Li X
Liang J
Ma B
McIvor AM
Moreira IS
Neuvirth H
Platt J
Pupko R
Reddi AH
Res I
Roberto Hiroshi Higa
Rost B
Wesson L
Yuan C
Publication venue: Sociedade Brasileira de Genética
Publication date: 01/01/2009

In this work, we present a method for predicting hot spot residues by using a set of structural and evolutionary parameters. Unlike previous studies, we use a set of parameters which do not depend on the structure of the protein in complex, so that the predictor can also be used when the interface region is unknown. Despite the fact that no information concerning proteins in complex is used for prediction, the application of the method to a compiled dataset described in the literature achieved a performance of 60.4%, as measured by F-Measure, corresponding to a recall of 78.1% and a precision of 49.5%. This result is higher than those reported by previous studies using the same dataset.

Repositorio da Producao Cientifica e Intelectual da Unicamp

Structure of a lectin from Canavalia gladiata seeds: new structural insights for old molecules

Author: A Deacon
A Vargin
A Williams
AGW Leslie
Alexandre H Sampaio
AW Schuettelkopf
Beatriz T Freitas
Benildo S Cavada
Bruno AM Rocha
CAD Gadelha
DE McRee
EJM Van Damme
Emmanuel P Souza
FBMB Moreno
Frederico BMB Moreno
GA Rosenthal
GM Edelman
Gustavo A Bezerra
HM Berman
J Bouckaert
J Ton
JB Taylor
JH Naismith
JJ Calvete
K Boege
L Taiz
LIMM Da Silva
P Rozan
PJ Lea
Plínio Delatorre
PR Evans
R Loris
SJ Hamodrakas
T Swain
Taianá M Oliveira
Tatiane Santi-Gadelha
TW Hamelryck
VM Ceccatto
Walter F Azevedo
WJ Peumans
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Lectins are mainly described as simple carbohydrate-binding proteins. Previous studies have tried to identify other binding sites, which possibly recognize plant hormones, secondary metabolites, and isolated amino acid residues. We report the crystal structure of a lectin isolated from Canavalia gladiata seeds (CGL), describing a new binding pocket, which may be related to pathogen resistance activity in ConA-like lectins; a site where a non-protein amino acid, α-aminobutyric acid (Abu), is bound. Results: The overall structures of native CGL and of CGL complexed with α-methyl-mannoside and Abu have been refined at 2.3 Å and 2.31 Å resolution, respectively. Analysis of the electron density maps of the CGL structure clearly shows the presence of Abu, which was confirmed by mass spectrometry. Conclusion: The presence of Abu in a plant lectin structure strongly indicates the ability of lectins to carry secondary metabolites. Comparison of the amino acids composing the site with other legume lectins revealed that this site is conserved, providing evidence of its biological relevance. This new action of lectins strengthens their role in defense mechanisms in plants.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Queen's University Belfast Research Portal

BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server

Author: Bing Xiong
Jie Wu
David L Burk
Mengzhu Xue
Hualiang Jiang
Jingkang Shen
WA Warr
A Kouranov
A Godzik
OC Redfern
SG Buchanan
K Lundstrom
DF Veber
D Lee
SF Altschul
A Bateman
BE Engelhardt
J Soding
C Chothia
L Holm
AG Murzin
CA Orengo
A Andreeva
TA Binkowski
GJ Kleywegt
RA Laskowski
RB Russell
S Schmitt
A Shulman-Peleg
AC Wallace
T Hamelryck
M Ashburner
P Willett
HM Berman
GP Brady
WR Pearson
A Gutteridge
T Fawcett
ND Gold
J Blaszczyk
K Yeturu
RA Laskowski
L Xie
MP Liang
M Brylinski
XY Jiang
D Pal
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of the sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, large gaps remain in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of genes newly identified by genomics technology. Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are confined to a sub-pocket. This fingerprint-based similarity measurement was also validated on a known binding site dataset by comparing it with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After the database search, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins. Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of the hit list from the database search.

Southampton (e-Prints Soton)